The genomic history of the indigenous people of the Canary Islands

The indigenous population of the Canary Islands, which colonized the archipelago around the 3rd century CE, provides both a window into the past of North Africa and a unique model to explore the effects of insularity. We generate genome-wide data from 40 individuals from the seven islands, dated between the 3rd–16rd centuries CE. Along with components already present in Moroccan Neolithic populations, the Canarian natives show signatures related to Bronze Age expansions in Eurasia and trans-Saharan migrations. The lack of gene flow between islands and constant or decreasing effective population sizes suggest that populations were isolated. While some island populations maintained relatively high genetic diversity, with the only detected bottleneck coinciding with the colonization time, other islands with fewer natural resources show the effects of insularity and isolation. Finally, consistent genetic differentiation between eastern and western islands points to a more complex colonization process than previously thought.


Lomo Galeón (Gran Canaria)
Lomo Galeón is a burial site on the southwestern coast of Gran Canaria. It is situated on the left margin of the ravine just above the Bahía of Santa Águeda. The site comprises several sets of gravestone cysts excavated in the 1940s and 1980s 8 . The archaeologists found four individuals, including three men and one woman 30 . The three men presented auricular exostosis, a condition 5 related to swimming in cold/temperate waters 31 . The site has already been dated between 1250 -1290 cal AD (95% probability) based on the analysis of one of the individuals 32, 33 . The four individuals from Lomo Galeón are included in the present study, although paleogenomic analysis point to them as being two males and two females (instead of three males and one female as determined by anthropological analyses). All remains have been consistently dated between the 10 12 th and 14 th centuries cal AD (95% probability) (CAN.021, CAN.022, CAN.023, and CAN.024).

Los Pasitos (La Palma)
Los Pasitos is an archaeological complex situated on the eastern side of La Palma 15 comprising open-air and cave dwellings, rock-art stations, and a burial cave. A disturbed funerary deposit was recovered by local archaeologists. This site is located nearby significant archaeological sites such as Roque de Los Guerra and Belmaco. In this study, we include a male individual from the site of Los Pasitos (CAN.035) radiocarbon dated between 1157 -1264 cal AD (95% probability). 20

Montaña Mina (Lanzarote)
The collective burial cave of Montaña Mina is located in the central area of Lanzarote on the eastern side of a volcano. Montaña Mina was excavated in 1979 and represents the first indigenous burial in Lanzarote to be systematically excavated 34 . The cave is placed in an area 25 where significant archaeological sites are located, including La Atalaya, Lomo de San Andrés, Zonzamas and Fiquinineo. In this study, we include two male individuals from Montaña Mina. These individuals are radiocarbon dated between 1045 -1222 cal AD (95% probability) (CAN.037) and 1162 -1264 cal AD (95% probability) (CAN.038). 30 Puente de la Calzada (Gran Canaria) Puente de la Calzada is a burial cave located in the Guiniguada ravine in the central-eastern area of Gran Canaria. The site was discovered in 1993 and excavated in 2002. The cave contained seven individuals that were placed in two different periods. The human remains from the later episode were found in a primary position, while those from the earlier burial were 35 mainly recovered in a secondary position 35 . We included two individuals from Puente de La Calzada, one male and one female, with radiocarbon dates between 1265 -1388 cal AD (95% probability) (CAN. 26), and 1303 -1410 cal AD (95% probability) (CAN.025), respectively. 40 Punta Azul is a burial site located in a volcanic tube at the cliff of the southwestern coast of El Hierro. The site of Punta Azul is part of a large sepulchral area composed of dispersed natural caves that worked as collective burials. Among them, we can find some of the most important indigenous burial places, such as Montaña de La Lajura, Cueva de la Ballena or Letime 36,37 . The site was discovered in 1947 by the archaeologists Álvarez Delgado and Diego Cuscoy. They 45 identified six individuals in a primary position and other human remains in a secondary/disturbed position. They only recovered those elements that were considered relevant at the time. The recovered materials included five complete and five fragmented skulls, 18 mandibles, ten femurs, a piece of fur from a mortuary shroud, a lithic instrument, and a goat horn 39 . This site was re-excavated in 1994. They found an intensively disturbed burial where human remains were totally disarticulated in a secondary/disturbed position. More than 6,000 bone specimens were collected, including bones, bone fragments and teeth 40 . Some studies on the Punta Azul site have been focused on performing paleopathological 41-45 , paleodietary and paleo-nutritional studies 46-5 48 , as well as the establishment of discriminant functions for the tibia 49 , and the analysis of the particular characteristics of this population and its comparison with other contexts 44,50,51 .Two radiocarbon dates have already been obtained for two individuals from Punta Azul: 1022 -1159 cal AD (95% probability) and 1040 -1214 cal AD (95% probability) 36 . Four individuals from Punta Azul are included in the present study. Two are male and two are female (CAN.001, 10 CAN.002, CAN.003, and CAN.004). Radiocarbon dates were obtained for these individuals yielding a chronology between the 13 th and 15 th centuries cal AD.

Salitre (Tenerife)
Salitre is a burial site located in a volcanic crevice on the northern side of Montaña Rajada 15 at the volcanic caldera of Las Cañadas (El Teide National Park). The site has been known since the 19 th century, but archaeological fieldwork was not performed until 1945. Human remains and fragments of leather shrouds and clothes were recovered from this site 39 . The bodies were in supine decubitus over a two-layered bed mostly made of stone and vegetables. There is evidence of planks and forks used to make floorboards and even shelf systems that could hold standing 20 bodies 39,52 . The human remains show excellent preservation due to the natural presence of natron or salitre in the cave's walls. There is an ongoing debate on the minimum number of individuals buried here, ranging from 50 to 300. Archaeological excavations have identified a dozen individuals, male and female, and from all age groups, including children between four and seven years of age 17,53 . There are two already available radiocarbon dates for this site: 1045 - 25 1282 cal AD (95% probability) and 1517 -1656 cal AD (95% probability) 2 . This study includes one male radiocarbon dated around 1024 -1155 cal AD (95% probability) (CAN.048).

Salto Casimiro (La Palma)
The burial site of Salto de Casimiro is located in the Hermosilla ravine in the central western 30 area of La Palma. The site was accidentally discovered in the 1970s during road construction. Unfortunately, these road works destroyed a significant part of the necropolis. However, an archaeological intervention was performed in the preserved area of the site (Nuria Álvarez Rodríguez, personal communication). We included a sample from one female radiocarbon dated between 579 to 652 cal AD (95% probability) (CAN.052). 35

Supplementary Note 2: Endogenous content and enrichment results
Individuals with high endogenous DNA content (>10%) and post-capture libraries were sequenced to saturation on an Illumina NextSeq 500 platform (paired-end reads, 2 x 75 bp and 2 x 42 bp). Mapping and filtering were performed as in 54 . Briefly, reads were trimmed and 5 adapters removed using AdapterRemoval version 1.5.4 55 , with a minimum insert size of 30 bp and a minimum base quality of 20. Paired-end reads were then merged with a minimum overlap length of 11 bp. Trimmed merged reads were then mapped to the human reference genome (GRCh37) using BWA version 0.7.12 56 . For libraries sequenced using the 2 x 42 bp paired-end method, unmerged reads up to 141 bp were also kept for analysis, replicating the same insert size 10 as 2 x 75 bp paired-end merged reads. For libraries sequenced using the 2 x 42 bp paired-end method, we also used the clipOverlap function from bamUtil v.1.0.14 to trim overlaps smaller than 11 bp on paired-end reads 57 . The mapping was performed using "bwa -aln" with the seed option (-l) disabled. After mapping, we removed reads with a mapping quality lower than 30, duplicated reads and reads with alternative mapping coordinates. All filtering was performed 15 using SAMtools version 0.1.19 58 . Bam files from different runs were merged per individual using SAMtools merge. The amount of endogenous DNA for shotgun and capture libraries was calculated by dividing the number of reads after filtering by the total number of trimmed reads. To normalize results, a subset of 2 million raw reads was used for comparison between individuals. All figures 20 were produced using R version 3.6 59 and the package ggplot2 60 .

Shotgun sequencing
Endogenous DNA content was highly variable between and within archaeological sites. 25 DNA conservation was relatively good for some of the indigenous individuals, with six individuals from four different archaeological sites showing rates above 30% (Supplementary Data 1; Supplementary Fig. 1). Overall, endogenous DNA in the shotgun libraries accounted for 10.7 ± 3.8% of the total (median = 4.6%, IQR = 1.5% -17.7%). Complexity of the libraries was also good, with an average duplicate rate of 0.66 ± 0.12% (Supplementary Data 1; 30 Supplementary Fig. 2).

MEGA capture
Ancient DNA libraries were also captured for SNPs contained in the Illumina Multiethnic 45 Genotyping Array (MEGA) array. MEGA capture produces a relatively good increase of coverage on the targeted SNPs (Supplementary Data 1), with an average increase of intersected

Supplementary Note 3: Authentication criteria
Authenticity of the data was assessed with MapDamage version 2.0.2 61,62 to identify the presence of damage associated with cytosine deamination and fragmentation, and with ContamMix version 1.0-10 63 and Schmutzi 64 to calculate contamination rates. For contamination 5 analysis, we used the mtDNA bam files filtered using the same pipeline as for nuclear DNA (see Supplementary Note 2), and with 3 bp trimmed at both ends to avoid damage interfering with contamination estimations. Simultaneously, we assessed for the autosomal contamination using the non-pseudoautosomic X-chromosomal data on male individuals. For that purpose, we used ANGSD 65 on X-chromosome polymorphic sites using reads with a base quality >20 and 10 mapping quality >30. Post-mortem damage patterns were as expected for ancient DNA. Deamination at the 3' ends of reads ranged between 6.12% and 44.14%, with an average value of 18.14 ± 2.68% (Supplementary Data 1; Supplementary Fig. 5). Average insert size was 60.99 ± 3.51 bp, with values ranging between 39.18 and 85.79 bp (Supplementary Data 1; Supplementary Fig. 6). 15 As published before for most of the individuals of these dataset 66 , contamination rates are low accounting for an average value of 1.95% ± 0.54% estimated with ContamMix and 1.52% ± 0.16% with Schmutzi ( Supplementary Fig. 7). Most of the human remains also show low levels of autosomal contamination, except for five males who display 6.1% to 15.2% of contaminated reads (Supplementary Data 1). However, as ANGSD contamination estimations on ancient low 20 coverage (<0.1X) data are problematic, we decided to keep the three individuals with autosomal contamination levels >5% and <0.1X for further analysis. Also, as the previously published gun002 and CAN.027 have contamination estimations of 6% and 8.2%, respectively, and their mtDNA contamination estimations are very low, we decided to keep them but considering their status. 25 5 The molecular sex of the individuals was identified using the ry estimate 67 and results were plotted using the R programming language (version 3.6) 59 and the package ggplot2 60 . As observed before 54 , we needed to filter pseudo-autosomal and repetitive regions prior to ry calculation to accurately assess the molecular sex of captured individuals ( Supplementary Fig.  8). Including previously published genomes from 68,71 , 21 individuals are females and 28 are 10 males (Supplementary Data 1, Supplementary Fig. 8).

Y-chromosome haplogroup classification
Male individuals were subjected to Y-chromosome analysis as in 54 . First, we performed a 15 SNP calling based on the ISOGG Y-DNA Haplogroup tree 2019-2020 database (v.15.73) using samtools mpileup, filtering out bases with BASEQ < 30. The resulting variants were classified in two groups based on the putative presence of DNA damage (C à T or G à A) and were visually classified among the main haplogroups following the Y-chromosome phylogenetic tree 69 . Additionally, individuals were classified using pathPhynder's "best path" method 70 . All 20 figures were produced using R v. 3.6 59 and the R packages ggplot2 60 and ggtree 70 . Including previously published genomes from 68,71 , a total of 28 ancient individuals from the Canary Islands were identified as males. Their geographic adscription is as follows: El Hierro (n = 2), Gran Canaria (n = 12), La Gomera (n = 2), La Palma (n = 1), Lanzarote (n = 2) and Tenerife (n = 9). No males were observed for Fuerteventura; therefore, no information is 25 available regarding Y-chromosome lineages present in the ancient population of this island. Overall, the most frequent Y-chromosome haplogroup in the indigenous population of the Canary Islands is E-M183 (57.1%), followed by T-M184 (21.4%), E-M33 (10.7%) and R-M269 (7.1%). Finally, E-M78 was observed at a frequency of 3.6% (Supplementary Fig. 9; Supplementary Data 2). 30 Initially, information on the Y-chromosome composition of the indigenous population was obtained using classical methodologies based on PCR 36,72 . Although E-M183 was not genotyped, a SNP for the E-M81 branch was included. As E-M81(xE-M183) is extremely infrequent 73 we can consider all E-M81 males to belong to the E-M183 sublineage. In the 2009 study 72 , E-M183 (E-M81) was also the most common Y-chromosome lineage, although in that case the frequency 35 was estimated to be 26.7%. The frequency for E-M33 (10.7%) was also estimated to be lower when using PCR methods (3.3%). On the other hand, the frequency of E-M78 was higher in the previous study (23.3%) than in this NGS analysis (3.6%). Moreover, haplogroups I-M170, J-M267 and P-M45* were observed by Fregel et al. 72 (Lanzarote). E-M33 is considered to be a sub-Saharan lineage whose highest frequencies (~50-60%) have been observed in South and Central Africa 74,75 . E-M33 is also present in current populations in North Africa, reaching its peak in West Sahara (9.0%); however no complete sequenced E-M33 Y chromosome is available from this region. Considering E-M33 is a lineage of sub-Saharan origin, the presence of E-M33 in the indigenous population of the Canary Islands confirms the impact of sub-Saharan migrations into 5 North Africa prior to ~2,000 years ago. This result is consistent with sub-Saharan mtDNA lineages such as L1 or L2 observed in the Canarian native people 66

Haplogroup R-M269 20
Two indigenous individuals belong to the R-M269 haplogroup (Supplementary Fig. 9; Supplementary Data 2): one from Punta Azul (El Hierro) and one from Guayadeque (Gran Canaria). R-M269 is the most common haplogroup in Western Europe, although it is also found in North Africa in lower frequencies 77  MtDNA evidence suggests that eastern and western islands could have a different genetic composition 66 , which could account for this variation in the placement of ancient people from the Canary Islands. To test that hypothesis, we plotted the indigenous people using a different color code for each island and we observe that western islands are clustered closer to Upper Paleolithic and Early Neolithic North Africans, while eastern islands are placed closer to Sardinians and 5 other modern European populations ( Supplementary Fig. 15). There is only one exception to the eastern/western distribution: one individual from Tenerife (CAN.039) that is placed within the eastern islands' cluster. Some of the archaeological sites from Tenerife we included in this study belong to the post-contact period. Individual CAN.039 belong to the archaeological site of La Angostura, a site with clear indigenous funerary practices and material culture 17 . Radiocarbon 10 dating from this site indicates this burial was used between 1,295 and 1,414 AD 17 . After performing RCD on this particular individual, results indicate that CAN.039 was dated between 1,396 -1,447 cal AD. As contact with European sailors and explorers started in Tenerife in the 13 th century, it leaves the possibility of European admixture or that this individual came from a different island. 15 Although aDNA libraries have been subjected to enrichment techniques, a limitation of our study is the low coverage of some individuals (Supplementary Data 1), which makes us question if sample clustering in the PCA can be accurate. The advantage of using LASER is that sequence data for each reference individual is simulated to match the coverage pattern of each individual. Then a PCA ancestry map is built based on the simulated sequence reads for both the reference 20 and the ancient individuals. Finally, the ancestry map is projected into the PCA space built based on the whole reference panel. The stochastic variation introduced by the simulation procedure of the LASER method can lead to slightly different results in the placement of each individual. For that reason, our analysis was based on the average of 10 independent replicates. This approach allows us to better place each individual into the PC space, but also, by plotting the result 25 obtained in 10 replicates, we can test how coverage correlates with uncertainty in the placement of a particular individual. For that, we compared the replicates of individuals with different coverage rates in our dataset ( Supplementary Fig. 16). Placement in the first component is stable through replicates, even for individuals with very low coverage (~3%). Coordinate values in PC2 fluctuate slightly in low-coverage genomes (~10% of the SNPs covered) and increase in 30 individuals with lower values (~3%). However, although this phenomenon can account for some uncertainty on the placement of the individuals in PC2, it is not enough for accounting for the differentiation between the two clusters of eastern and western Canarian people.

Human Origins panel 35
The indigenous population of the Canary Islands was also compared to other available modern and ancient populations using the Human Origins panel and the lsqproject option from smartPCA 100 . To this end, we compared the Canarian natives to ancient and modern populations from North and sub-Saharan Africa, the Middle East and Eastern, Southern and Western Europe. 40 In this PCA, the first component separates sub-Saharan African populations from the rest, while the second component accounts for differentiation within sub-Saharan Africa ( Supplementary Fig. 17 Supplementary Fig. 19).

Global ancestry 25
For unsupervised clustering analysis, we used the same datasets as for lsqproject PCA, but pruning for linkage disequilibrium using PLINK v1.90 105 , with parameters --indep-pairwise 200 25 0.4. In addition, the global ancestry of ancient individuals was determined by unsupervised clustering using ADMIXTURE software v1.3.0 106 . The analysis was performed in 10 replicates 30 with different random seeds, and only the highest likelihood replicate for each value of K was taken into consideration.

Human Origins panel
In Supplementary Fig. 21 and Supplementary Fig. 22, we show ADMIXTURE results for the modern and ancient Human Origins panel, respectively, with K values ranging from 2 to 8. 10 As  54,102 . Also, some individuals have a component associated with Neolithic Iranians (grey) that is present in ancient and modern populations from the Near East but also in Bronze Age people from the steppe. Although from a mitochondrial DNA perspective, a sub-Saharan African input in the indigenous 25 people of the Canary Islands is inferred from the presence of L haplogroups, the unsupervised clustering analysis only shows a small sub-Saharan contribution (red) in a few individuals, and in all cases, the observed values are below 5%. Interestingly, some differences are observed between western and eastern populations. At K=8, individuals from the western islands show a higher proportion of the autochthonous 30 Maghrebi component and a lower contribution associated with steppe populations when compared with eastern islands (Fig. 2b). This differentiation can be observed both comparing the different islands using a ternary plot depicting the main components present on the indigenous population ( Supplementary Fig. 23) and a boxplot showing the contribution mean of each component per island ( Supplementary Fig. 24). On the other hand, the contribution of the 35 European Neolithic and the Near Eastern components are similar in both regions ( Supplementary  Fig. 24).
When compared to the current population of the Canary Islands, an increment is observed in the component likely associated with the Steppe contribution and the Near Eastern component, and a lower autochthonous North African contribution (Fig. 2b). This result is expected giving 40 the fact that the modern population of the Canary Islands is admixed with the main parental populations being the indigenous people and European colonizers. Giving the time of the human colonization of the Canary Islands, it is also interesting to compare them to modern North Africans, to get an insight of the impact that later migrations had on the region (Fig. 2b). Modern North African people exhibit a lower proportion of the ancient 45 Maghrebi component when compared to the Canarian natives. As this component is higher in western North Africa 101 , the greatest differences are observed for Egypt and Libya. We also observe a greater contribution from the Near Eastern and Sub-Saharan African components in modern North Africans. These differences correlate to additional migration reaching this area in the last two millennia, including the Arab expansion in the seventh century and the trans-Saharan migrations.
The high proportion of the Maghrebi component observed in the Canary Islands indigenous people is indicative of their origin being related to Berber populations in North Africa, a 5 hypothesis that has been supported by extensive archaeological, linguistic, and genetic analyses 66,73,[107][108][109][110][111][112][113][114][115][116][117][118][119][120][121][122] . The inclusion of other historical individuals in the PCA allows us to test some controverted hypotheses that have been proposed for the origin of the indigenous people (Supplementary Fig. 19; Supplementary Fig. 23). Based on radiocarbon data obtained from bulk sediment samples in the archaeological site of Buenavista in Lanzarote 112 , some archaeologists 10 have proposed chronologies that go as far as the 5 th century BCE 112-114,121 , leading them to propose a Phoenician origin for the first settlers of the Canary Island 112-114,123 . However, a recent review of this data and the application of radiocarbon hygiene criteria have shown that these older dates for the colonization of the Archipelago are not reliable 109 . Furthermore, the archaeological record show no clear cultural adscription with Phoenicians. Specifically, 15 archaeological evidence is composed of small pieces of pottery from non-diagnostic regions and excavated in a site with a questioned stratigraphy that cannot discard the intrusion of materials from other periods. For those reasons, scholars have contested the affiliation to this site to the Phoenician culture 115 . Regarding an Egyptian origin for the indigenous people of the Canary Islands, although some scholar traditions up to the XIX century proposed a link between Ancient 20 Egypt and the islands based on the practice of mummification in both regions, archaeologists promptly disregarded it. However, this association between Egypt and the Canary Islands has remained in the collective imaginary 116 . As for the hypothesis on the Roman colonization of the islands, a few aspects should be taken into consideration. The purple dye workshop in the islet of Lobos is the only site with a clear Roman adscription. However, its archaeological record shows 25 no interaction between this population and the Canarian indigenous people 110,111 . In this sense, absolute radiocarbon dates from this site point to the 1 st century BC, previous to the proposed date of the colonization of the islands (between the 2 nd and 5 th centuries AD). Also, the materials found in this site, especially Roman amphoras, provide a relative chronology that situates this settlement in a relatively short period (between 50 and 100 years), showing its limited impact on 30 the population of the Canarian Archipelago as a whole 33,109 . Based on the ADMIXTURE results observed for the indigenous people and the historical individuals included (ancient Egyptians, Phoenician and Etruscan), the natives of the Canary Islands have a higher ancient North African component than Egyptians and the Phoenician individual from Ibiza ( Supplementary Figures 23 and 24). Also, their percentage of the 35 autochthonous Maghrebi component is higher than the Etruscan individual with North African admixture published by 97 . Egyptian individuals are characterized by having a higher Iran Neolithic contribution, while the Phoenician and Etruscan individuals have a higher steppe component than most of the individuals from the Canarian archipelago. Based on those results, it is unlikely that the origin of the natives could be solely related to a single migration event from 40 these populations.

Supplementary Note 6: f-statistics analyses
We performed the analysis of outgroup f3-statistic using qp3Pop program from ADMIXTOOLS (https://github.com/DReichLab/AdmixTools), to determine the amount of shared drift between two ancient populations PopA and PopB. For that, we chose an outgroup 5 population that has not experience any post-divergence gene flow with either PopA or PopB as the target population (e.g. Jo'hoan North), and calculated the f3-statistic in the form f3(PopA, PopB; Outgroup). In this way, the result of the f3-statistic will be a positive value proportional to the length of the shared drift path of populations PopA and PopB with respect to the outgroup. Considering the populations contained in the Human Origins dataset, the indigenous people 10 of the Canary Islands share more ancestry with Iberian, Aegean and other European Early Neolithic populations ( Supplementary Fig. 25 islands show a higher ancient North African contribution. To test if differences are observed between islands, we repeated the outgroup f3-statistic calculations at insular level and compared the amount of shared ancestry with populations of interest. Differences were observed between the amount of shared ancestry between islands populations with Neolithic and Bronze Age populations from Europe, with the eastern islands producing higher values. Figure S26 shows the 20 results obtained for Middle Neolithic to Bronze Age populations from Western Europe and Late Neolithic to Bronze Age populations from the steppe. When compared with Early and Late Neolithic individuals from North Africa, slightly differences between eastern and western islands are observed for the Late Neolithic population, but not for the Early Neolithic population. However, outgroup f3-statistics are difficult to compared and do not provide information about 25 the contribution of different populations, and a more detailed analysis using admixture modeling is needed.

Supplementary Note 7: qpAdm modeling
Admixture modeling was performed using qpAdm from ADMIXTOOLS, following 124 and using the Allen Ancient DNA Resource curated by the Reich Lab at Harvard (https://reich.hms.harvard.edu/allen-ancient-dna-resource-aadr-downloadable-genotypes-present- 5 day-and-ancient-dna-data, version 42.4). First, we defined the target as the whole population of the Canarian archipelago. For the right populations, we chose the same outgroups as in 87  It is worth mentioning that models with Iberia_BellBeaker do not reach significance. Then, we applied the best four 4-stream models to determine differences between a) the two geographical regions (eastern and western islands), b) each island population (El Hierro, La Palma, La Gomera, Tenerife, Gran Canaria, Fuerteventura and Lanzarote) and c) each individual. 5 When comparing the eastern and western regions (Supplementary Data 5), we observe that western islands have a higher contribution of Morocco_EN (8.3% ± 1.1%) than eastern islands (4.9% ± 1.1%). Also, the Germany_BB contribution is slightly higher in the eastern islands (16.0% ± 2.0%) than the western islands (11.4% ± 1.9%), while the rest of the components remain relatively stable. If we look at each island by itself, we observe that La Palma has the 10 highest Morocco_EN contribution ( analyzed in combination with radiocarbon data, we observe that the slight differences between eastern and western regions are observed for the entire period of indigenous occupation ( Supplementary Fig. 27). This result is more clearly seen for the two islands with a larger sample size: Tenerife and Gran Canaria. When the Morocco_EN contribution is compared with radiocarbon data obtained from the individuals, we observed that values are relatively lower in 25 Gran Canaria for the entire indigenous occupation period ( Supplementary Fig. 28).
Although the presence of a steppe component in the indigenous people of the Canary Islands can be explained based on a Bronze Age migration to North Africa, it can also be the result of the impact of Punic or Roman populations in North Africa before the settlement of the archipelago. To test this hypothesis, we repeated the four-stream qpAdm modeling but replacing 30 German One limitation of our study is that sample sizes are low for some islands, for example, 45 Lanzarote and Fuerteventura with only two individuals. To account for sample size differences, we compared the qpAdm admixture values obtained at individual level by island ( Supplementary  Fig. 29). Mean values for the contribution of Early Neolithic Moroccans to western islands (8.60% ± 0.57% for El Hierro, 7.67% ± 1.91% for La Gomera and 8.82% ± 1.09% for Tenerife) was higher than that of those of the eastern island of Gran Canaria (5.54% ± 1.20%). The only overlapping value between the western and eastern regions is the one calculated from the individual CAN.027 from La Gomera, but never reaching values as low those observed for the 5 islands of Lanzarote and Fuerteventura. Hence, although some variation is observed within individuals from the same island, a regional differentiation between eastern and western islands is confirmed. Family relationships between individuals were estimated using READ software 126 based on the MEGA-HGDP genome-wide data. This method allows us to infer up to second-degree relationships, including nephew/niece-uncle/aunt, grandparent-grandchild or half-siblings. This approach is especially appropriate for kinship inferences on ancient DNA data, as it can produce reliable results for individuals with coverage greater than 0.1X. To avoid lack of coverage to 10 confound the estimations, we excluded CAN.043 from Tenerife and CAN.046 and CAN.039 from Gran Canaria (<0.2X covering the MEGA-HGDP dataset). We only detected two individuals from Tenerife (CAN.045 and CAN.046) with an average pairwise P0 coinciding with a second-degree relationship ( Supplementary Fig. 30). The rest of the individuals from the islands showed no apparent kinship between them. 15

Heterozygosity estimations
Heterozygosity estimates were obtained for each island population and for each archaeological site using popstats 127 Fig. 32). These results confirm previous diversity estimations based on uniparental markers 66 in which islands with more exploitable natural resources (Tenerife, Gran Canaria and La Palma) had more diversity than smaller islands or islands with more limited 40 resources.
To account for sampling bias, heterozygosity values were estimated for all possible combinations of two individuals within each island (the lowest sample size, from Fuerteventura and Lanzarote) and the average value was calculated. To mirror the individuals of Lanzarote and Fuerteventura, we selected for heterozygosity estimations one individual with a low-coverage 45 genome available and one captured individual. Again, heterozygosity estimations are variable depending on the individuals involved in the calculation, but mean and median values for the islands of Tenerife and Gran Canaria are higher than those observed for El Hierro and La Gomera (Fig. 3a). In order to explore the variation on the heterozygosity estimations, we focused on the better characterized islands of Tenerife and Gran Canaria ( Supplementary Fig. 33). Variation on the heterozygosity values estimated by pairs of individuals can be explained by low overlap within the two compared individuals, with lower coverages tending to produce lower 5 heterozygosity estimations, with a moderate correlation between the pi value and coverage (Tenerife: r=0.449 and p=0.0020; Gran Canaria: r=0.510 and p=0.0078). However, no combination of two individuals produced values as low as those observed for Lanzarote (pi = 0.193) and Fuervententura (pi = 0.184). Even though these values could be an underestimation of the real values for Lanzarote and Fuerteventura, the results obtained point to a lower 10 heterozygosity value for these islands, as observed for El Hierro and La Gomera.

Inbreeding estimations
The presence of runs of homozygosity (ROHs) was detected using hapROH 128 for both the 15 Human Origins and the MEGA-HGDP datasets. In this method, haplotype information from a modern DNA panel is used to detect ROHs longer than 4 cM in aDNA genomes with coverage as low as 0.3X. We screened for the total sum of ROH segments in four length ranges (4 -8, 8-12, 12-20 and >20 cM) in individuals with genome coverage higher than 0.3X. First, we focused on short ROH segments of 4 -8 cM (ROH [4,8]) that accumulate from 20 parents on 1 -30 generations ago and are indicative of the population size of the ancestral mating pool (background relatedness) over approximate the previous half millennium 128 . ROH Supplementary Data 8), which is indicative of relatively small effective population in the past, probably on the earliest stages of the human settlement. Second, we tested our individuals having >20 cM ROH (sROH>20cM) segments greater than 50 cM, which is indicative of small and isolated populations 128 . We identified five ancient individuals exceeding this long ROH threshold: CAN.004 from El Hierro, CAN.005 from 30 Fuerteventura, CAN.030 from La Gomera, CAN.038 from Lanzarote and CAN.038 from Tenerife. Interestingly, the four of them with the greatest sROH>20cM and total ROH exceeding 300 cM are from either islands with the smallest areas and/or with poorer resources. This amount of ROH can be due to sporadic close-kin matings due to small population sizes. Also, all individuals are from the latest stages of the indigenous period of the Canary Islands (from 12 th 35 century onwards) which can be indicative of the effect of isolation in the archipelago population.

Effective population size
Estimates of effective population size (Ne) for insular populations were obtained using 5 hapROH 128 for HGDP-MEGA dataset as it is the one with a better coverage (Fig. 4a) Data 9). We observed that for all the islands, the Ne confidence intervals estimates are considerably wider than when using the original sample size. As expected, we also observed that for different individuals of the same island, Ne estimates based on one individual vary depending on the proportion of his/her genome on ROHs. For all the above, although the inbreeding level observed in the only two available individuals from Lanzarote and Fuerteventura are a testament 40 of the small Ne of these island populations, estimates for these islands should be reevaluated when additional paleogenomic data is available.

Founder effects 45
To evaluate the history of reductions in population size in the indigenous populations of the Islands, we used ASCEND 130 . ASCEND relies on the decay of the allele sharing correlation across individuals in a population to infer the time and intensity of a founder/bottleneck event to low-coverage genomes. We followed the recommendations from the authors and ran the analysis for four different datasets from the MEGA-HGDP panel: (1) all indigenous individuals, (2) individuals divided by their geographical location (eastern -Lanzarote, Fuerteventura and Gran Canaria-and western -Tenerife, La Gomera, El Hierro, La Palma-); and (3)  characteristic with the Archipelago's. However, this analysis can also produce artifactual results as different islands could have experienced founder effects of different intensities. If we consider Gran Canaria alone, a possible bottleneck event is estimated between the 4 th century BCE and the 2 nd century CE, similar to the eastern's potential bottleneck in intensity and time range (95% CI: 2.3% -2.9%). This result is expected because most of the eastern people 35 are from Gran Canaria. Alternatively, individuals from Tenerife could have experienced a more drastic event (95% CI: 3.7% -4.7%) than the western people but in the same period (95% CI: 242 CE -66 BCE), similar to those seen in present-day islanders 130 . When we filtered individuals based on their SNP coverage, we do not see significant differences in the bottleneck events' dating, except for Gran Canaria and the eastern dataset 40 ( Supplementary Fig. 34). As Gran Canaria is the island from which we have sampled a more ample territory (all of the individuals from Tenerife are from a very restricted area of the island) this result may not be due to coverage itself but to sample diversity. Even though the minimum recommended sample size N≥5, as we see that coverage and sample size does not seem to affect results in islands such as Tenerife, we ran the analysis with 45 the El Hierro individuals. In fact, although there are four individuals, only one of the genomes is below 1X of coverage. Using that dataset, we detected that El Hierro individuals shared a bottleneck event occurring between the 8 th and the 12 th centuries CE, few generations before the individuals lived. The intensity of the event is ~12%, considerably greater than the rest of the events studied before. However, the fact that all of them are from the same archeological site and close in time could be affecting the results. More genome-wide data from El Hierro will be needed to determine if this event affected all the island population. 5

Supplementary Note 10: Phenotype analyses
The Canarian indigenous genomes were assessed for phenotypical traits by analyzing the alleles present at a population level in several loci of interest (Supplementary Table 1). For that, we performed SNP calling using samtools mpileup, filtering out low-quality bases (BASEQ < 5 30) and low-quality mapped reads (MAPQ < 25). Due to the low coverage of the genomes, we only considered one random allele for each position for all the variants (pseudo-haploid calling) and studied the variants present in the indigenous pool with depth > 1. First, we assessed the five known MCM6 SNPs associated with lactose tolerance (rs4988235G>T, rs41380347A>C, rs41525747G>C, rs145946881C>G, rs869051967A>C) 131 . 10 All indigenous individuals carried the ancestral alleles for all the variants, except for one individual with two alleles for the rs4988235A variant (Supplementary Table 1). Probably the most common phenotype in the Canary Islands indigenous people was lactose intolerant as previously seen 68 .
We also reviewed the bibliography and considered several medical conditions that are 15 common in present-day Canarians when compared to the Iberian Peninsula, in order to prove if their associated variants were already presented in the indigenous gene pool (Supplementary Table 1). The PLCG2 gene has been associated with autoimmune conditions such as asthma in present-day Canarians, at high frequency when compared with the Iberian founder population, and has been linked to their North African ancestry 132 . We found that the derived variant 20 associated with asthma in this gene (rs3852738T) is found in a frequency of 79.2%, so the indigenous origin proposed by the authors is plausible.  Fig. 36). The sub-Saharan contribution is low in all the islands, with the highest value observed in La Gomera (4.0% ± 0.5%) and the lowest in La Palma 5 (1.1% ± 0.5%) (Supplementary Fig. 36). When insular indigenous populations are considered for calculations, the contribution of the native people is higher in La Gomera, Lanzarote and Fuerteventura, although we cannot rule out the possibility that this result could be due to low sample sizes, as the two islands with the larger numbers of individuals (Tenerife and Gran Canaria) produce similar results in both analyses. 10 Regarding of the method used for obtaining the admixture proportions for the modern Canarian population, it is clear the impact that the European colonization had in the indigenous people. Although the contribution from the mtDNA is more than 50%, this value is lower than 5% when the Y-chromosome is considered. Genome-wide data point to an overall contribution around 17.8% ± 0.8%, similar to the value obtained from 68 (16.7% ± 5.0%). As was previously 15 observed for the mtDNA, the contribution of the indigenous people to current Canarians is variable between islands. From the Y-chromosome, we observe than the contribution is almost zero for the smallest islands of El Hierro, La Palma and La Gomera. Higher values are observed for the larger islands of Gran Canaria (9.1%) and Tenerife (8.2%). It is interesting that the island of Fuerteventura has a contribution of 12% while Lanzarote, with a similar conquest and 20 colonization history, has a value of 1.2%. As the Y-chromosome composition was most probably variable within the archipelago as observed for the mtDNA, all these analyses would need to be repeated when enough Y-chromosome data is available to perform calculations at an insular level. Regarding genome-wide analyses, it is expected from historical data that the islands that have received more migration since colonial to modern times, are the ones with the lowest 25 indigenous and the highest European contributions. This includes the two islands with the capital cities of the provinces of Las Palmas de Gran Canaria and Santa Cruz de Tenerife, but also the island of La Palma. The island of La Palma and Tenerife had an important arrival of European populations, mainly Portuguese and Castillians since the beginning of the Modern period. Although the sub-Saharan African contribution is low, this result is also interesting as it 30 demonstrates the impact of the slave trade in the Canary Islands and the incorporation of freed enslaved people into the Canarian society as demonstrated by 138 .

Supplementary Note 12: IBD analyses
We imputed the genomes using GLIMPSE v1.0.1 139 with the default parameters as recommended by 140 . Given that 0.1X coverage is enough to get 80% of imputation accuracy at common variants (minor allele frequency >= 10%), only >0.1X individuals (N=22) were selected 5 for the imputation step (Supplementary Data 1). We first computed genotype likelihoods for each genome using the candidate variant sites contained in the 1000 Genomes phase 3, filtering out for non-biallelic and singleton variants. We split the chromosomes into 2 Mb chunks using 200 Kb of buffer size with GLIMPSE_chunk. Each chunk was then imputed and phased using GLIMPSE_phase and ligated into the same chromosome using GLIMPSE_ligate. The most 10 likely haplotypes (--solve parameter) were then sampled from the ligated VCFs. We filtered out the individuals with an average genotype probability (GP) lower than 0.95 and removed variants with an INFO score <0.5.
To analyze the population structure among ancient Canarians and between them and other ancient European and African populations in finer detail, we identified shared genomic segments 15 that are identical by descent (IBD Finally, we carried out genetic clustering of the ancient individuals using hierarchical community 25 detection on a network of pairwise IBD-sharing similarities, as in 140 . Briefly, we used igraph 147 to build a weighted network of the individuals using the fraction of the genome shared IBD between pairs as weights, and used this network to perform community detection using the Leiden algorithm 148 implemented in the leidenAlg R package (https://github.com/kharchenkolab/leidenAlg). 30 We observe that a single genetic cluster for all the African individuals was created separating them from European people (Supplementary Fig. 37). It is worth mentioning that this cluster does not contain the Late Neolithic people from Morocco, that are placed with Chalcolithic and Early Bronze Age individuals from Europe. The African cluster identifies five communities: three for the Canary Islands people and two for the rest of the ancient North 35 African and sub-Saharan African individuals. Focusing on the Canarian ones (N=22), we observe that this analysis separates all individuals from the eastern islands from those of the western ones, except for El Hierro individuals who are placed in a specific cluster (Fig. 4b), as expected for individuals who share a high fraction of their genome in IBD and consistent with consanguinity due to isolation. When focusing on the western cluster, we observe that, as 40 expected in isolated insular environments, individuals from the same island share a higher fraction of their genome in IBD, clustering them together. One of the cluster groups La Gomera and La Palma individuals separated from the Tenerife individuals, and then each island in their own cluster. In the case of the eastern islands, the individuals seem to be more difficult to group together than the western ones, as the fraction of the genome shared IBD within pairs is lower. 45 Although it is possible to group the Fuerteventura and Lanzarote individuals separated from the ancient Gran Canarians, the difference is not as significant as in the western people. We can observe that those individuals are more related to CAN.008 (Agujero) and CAN.017 (Guayadeque) than any other individual from Gran Canaria. The other two Gran Canaria clusters are formed by individuals from potentially the same archaeological site. Individuals CAN.049 to CAN.051 belong to the Cendro site, while gun005 and gun008 belong to the University of Edinburgh Museum collection and were probably obtained from the same site, although given 5 the lack of archaeological context, this cannot be demonstrated. Finally, an important result is that putative migration between eastern and western islands is not observed, as no individual from either region shares IBD with the other cluster. shown in colored dots.             Analysis performed using four individuals from El Hierro, three individuals from La Palma; four individuals from La Gomera, 11 individuals from Tenerife, 23 individuals from Gran Canaria, two individuals from Lanzarote, and two from Fuerteventura. Color code for islands is as in 5 Supplementary Fig. 15. The squares represent the estimated contribution, while the whiskers represent its standard error.
Tenerife sub−sampling replicate Archipelago_SB Tenerife  Tenerife_avSB  Tenerife_rep1  Tenerife_rep10  Tenerife_rep2  Tenerife_rep3  Tenerife_rep4  Tenerife_rep5  Tenerife_rep6  Tenerife_rep7  Tenerife_rep8  Tenerife_rep9 Western_Islands Western_Islands_SB Founder intensity (%) event intensity for 10 replicates (_rep) including four individuals each and the average intensity of the 10 replicates (_avSB). The grey tile represents the intensity range of most island founder events according to ref. 130 , while the dotted line represents the founder intensity of Ashkenazi Jews (~1.7%). 5 Supplementary Fig. 36. Admixture estimations based on genome-wide data considering the global indigenous people (blue) and each insular indigenous population (red). Each dot represents the average admixture contribution of each population source (Canarian indigenous, Spanish or Yoruban) to each present-day island population, associated to their standard error. 5 This analysis has been performed using 34 present-day individuals from El Hierro; 35 from La